256 PART 5 Looking for Relationships with Correlation and Regression

would assume the chance of dying from radiation exposure may depend not only

on the radiation dose received, but also on age, gender, weight, general health,

radiation wavelength, and the amount of time over which the person was exposed

to radiation. In Chapter 17, we describe how the straight-line regression model

can be generalized to handle multiple predictors. You can generalize the logistic

formula to handle multiple predictors in the same way.

Suppose that the outcome variable Y is dependent on three predictors called X, V,

and W. Then the multivariate logistic model looks like this:

Y

e

a

bX

CV

dW

1

1

/

_

)

(

Logistic regression finds the best-fitting values of the parameters a, b, c, and d

given your data. That way, for any particular set of values for X, V, and W, you can

use the equation to predict Y, which is the probability of being positive for the

outcome.

Running a Logistic Regression

Model with Software

The theory behind logistic regression is difficult to grasp, and the calculations are

complicated (see the sidebar “Getting into the nitty-gritty of logistic regression”

for details). The good news is that most statistical software (as described in

Chapter  4) can run a logistic regression model, and it is similar to running a

straight-line or multiple linear regression model (see Chapters 16 and 17). Here

are the steps:

1.

Make sure your data set has a column for the outcome variable that is

coded as 1 where the individual is positive for the outcome, and 0 when

they are negative.

If you do not have an outcome column coded this way, use the data manage-

ment commands in your software to generate a new variable coded as 0 for

those who do not have the outcome, and 1 for those who have the outcome,

as shown in Table 18-1.

2.

Make sure your data set has a column for each predictor variable, and

that these columns are coded the way you want them to be entered

them into the model.